Skip to content

ClusterSearcher and RecoCluster#365

Merged
marc1uk merged 11 commits intoANNIEsoft:Applicationfrom
flemmons:Application
Mar 11, 2026
Merged

ClusterSearcher and RecoCluster#365
marc1uk merged 11 commits intoANNIEsoft:Applicationfrom
flemmons:Application

Conversation

@flemmons
Copy link
Contributor

-Updated DigitBuilder to be more consistent with and independent of ClusterFinder's corresponding functions
--Added some extra features, such as hit LAPPD count and strip-by-strip LAPPD hits. (some of these are considered experimental/preliminary)
-Added ClusterSearcher to replace ClusterFinder using RecoDigit and RecoCluster classes
-Updated RecoDigit and RecoCluster classes for corresponding use and new features, such as various cluster parameters
-Added NeutronCheck tool as output for RecoCluster information
-Added sample toolchain configfolder for using the new tools

…ClusterFinder's corresponding functions

--Added some extra features, such as hit LAPPD count and strip-by-strip LAPPD hits.
-Added ClusterSearcher to replace ClusterFinder using RecoDigit and RecoCluster classes
-Updated RecoDigit and RecoCluster classes for corresponding use and new features, such as various cluster parameters
-Added NeutronCheck tool as output for RecoCluster information
-Added sample toolchain configfolder for using the new tools
@jminock jminock self-requested a review October 30, 2025 13:03
@jminock jminock self-assigned this Oct 30, 2025
@S81D S81D self-assigned this Oct 30, 2025
//}
}
}
//FIXME: Need a method to have the 123 be equal to the number of operating detectors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be fixed now?

LappdNumStrips 60 ## num channels to construct from each LAPPD
LappdStripLength 100 ## relative x position of each LAPPD strip, for dual-sided readout [mm]
LappdStripSeparation 10 ## stripline separation, for calculating relative y position of each LAPPD strip [mm]
PMTMask configfiles/BeamClusterAnalysisMC/DeadPMTIDs_p2v7.txt ## Which PMTs should be masked out? / are dead?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can point to the most up to date path configfiles/LoadWCSim/DeadPMTIDs_p2v7.txt

if (!fisMC){
int pmtid = recoDigit->GetDetectorID();
unsigned long chankey = pmt_tubeid_to_channelkey[pmtid];
if (pmt_gains[chankey]>0) qep/=pmt_gains[chankey];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line 77 (m_variables.Get("SinglePEGains",singlePEgains);) you grab the SPE gains from the gains file then use it to convert from nQ --> pe for data. It doesn't look like you ever specify the gains file in the Config file. Could you add it to ClusterSearcherConfig? Alternatively you can grab that directly from the Store since the LoadGeometry tool populates it for downstream tools (like this one) to use. See PhaseIITreeMaker for an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current configuration in the provided toolchain is for use on MC simulations, which do not require the conversion. So this variable is not necessary within the configuration. I will add it to the readme file.
Though as I start looking at Data in the next couple of weeks, I'll likely move into taking the gains from the store in the next update to this tool. Thank you for the suggestion.

if(fMCParticles->at(i).GetPdgCode()==2112 && fMCParticles->at(i).GetParentPdg()==0) {
fTrueNeutronMult++;

if(fMCParticles->at(i).GetStopTime()>10000) fTrueNeutronDelayed++;
Copy link
Collaborator

@S81D S81D Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be wise to make this a configurable variable in case someone needs to look for neutrons in a different region of interest. Especially considering 10us is used extensively throughout the code.

//true_Emu*=1000; //GeV->MeV to match other energies(unneeded, possibly)

double theta = truevtx->GetDirection().GetTheta();
double p = sqrt(pow(true_Emu,2)-pow(105.7,2));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jminock is there a way to grab the momentum and Q^2 from the Store? I thought the LoadGenieEvent tool will store those values for use.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There absolutely is via: m_data->Stores["GenieInfo"]->Get("EventQ2",TrueQ2). Magnitude of muon momentum is not saved to the Store so this is the intended method to get the true muon momentum

Copy link
Collaborator

@jminock jminock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the changes listed. I worry with the large number of pointers if memory is being handled properly. I don't know how necessary all of the pointers are, and I am not enough of an expert on it to make a certain statement. I recommend double checking all are fitting general best practice before this gets sent off to Level 0 review

}

if (!fisMC){
ifstream file_singlepe(singlePEgains.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend adding a check file_singlepe.is_open() to catch typos in filename.

// ===================
for(int idigit=0; idigit<int(fSelectByNeighbours->size()); idigit++ ){
RecoDigit* recoDigit = (RecoDigit*)(fSelectByNeighbours->at(idigit));
RecoClusterDigit* clusterDigit = new RecoClusterDigit(recoDigit);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do the digits need to be on the heap?

if (verbosity > 3) std::cout << line << std::endl; //has our stuff;
if (line.find("#") != std::string::npos) continue;
std::vector<std::string> DataEntries;
boost::split(DataEntries, line, boost::is_any_of(","), boost::token_compress_on);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think token_compress_on merges repeated tokens? Is this desirable? If there are repeated tokens, doesn't this suggest an empty column, and by compressing that, your later columns will be shifted?

-Changed RecoCluster and RecoDigit lists in several tools to be vectors of objects, rather than vectors of pointers, to avoid memory complications
-removed unused convex hull function from RecoCluster class
-simplified CalcAS function in RecoCluster class by consolidating for loops
-removed several instances of commented-out code from older versions
-removed several debug outputs
-altered hard-coded time window values in several instances to rely on configuration input
-added use of the true Q2 value from the GenieInfo store to NeutronCheck's output.
-removed several uncontrolled cout lines, and replaced useful ones with Log.
-Tidied the Instance() function of ClusterSearcher
-Added vertex information to NeutronCheck
@jminock
Copy link
Collaborator

jminock commented Jan 13, 2026

Thank you @flemmons for the updates! Unfortunately, this branch has conflicts with the main branch that must be resolved in order to be merged. I will wait for the conflicts to be resolved and for the workflow check to be performed such that ToolAnalysis is confirmed to compile before I review the actual changes to the files

@flemmons
Copy link
Contributor Author

Conflicts were just Factory and Unity listing different added tools. I've cleared them.

@jminock
Copy link
Collaborator

jminock commented Jan 14, 2026

Okay, but could you still resolve those conflicts? Also, it's great that ToolAnalysis can compile in your workspace, but it would be better to get confirmation that ToolAnalysis can compile in a general workspace (on GitHub via workflow). Given that this PR is self contained, there is no reason to not resolve these conflicts. I will look at the other files in the meantime.

@jminock jminock added bug and removed Conflicts labels Jan 16, 2026
@jminock jminock removed the bug label Jan 26, 2026
//}
}
//FIXME: Need a method to have the 123 be equal to the number of operating detectors
double ucharge_balance = sqrt((total_QSquared) / (total_Q * total_Q) - (1. / 123.));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

123 is a magic number. While it doesn't impact functionality for now, it can be troublesome for documentation and updating it. What does 123 represent? What is an "operating detector"?

Copy link
Collaborator

@jminock jminock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address all the comments. I left some that are more general questions that would be nice to have accessible documentation addressing, and shouldn't be an issue to create. There are some comments that ask about memory. Please address them as they do impact functionality

… nor memory-safe with recent changes to RecoDigit storage.
Copy link
Collaborator

@jminock jminock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enough comments have been addressed such that any remaining comments are specific to this analysis. I think this is in good enough shape to merge so other neutron analyzers have access to these Tools to further their development.

@marc1uk
Copy link
Collaborator

marc1uk commented Mar 4, 2026

It would be helpful to 'resolve' those comments that have been addressed to make it clearer for reviewers what is still outstanding, without having to go back through the code again.

I'm not sure I agree that the remaining issues are all analysis-specific.

  • This comment suggests that the toolchain won't be writing out correct true q2 values, which seems important to me and could very well be difficult to track down if you're not aware of the reason.
  • RecoCluster::CalcCB() still seems to have a placeholder value of 123 with a comment //FIXME: Need a method to have the 123 be equal to the number of operating detectors - that should be simple enough to retrieve from the geometry (ANNIEGeometry::GetNumPMTs or similar). CalcCB is called by CalcParameters which is called by the ClusterSearcher Tool, so it seems like this information is not an irrelevant side-effect if incorrect.
    • (The same issue is also in HitCleaner::CBCheck, and while that seems less important as all it does is print things, my feeling is that if the code isn't functional it shouldn't be merged, so either delete CBCheck or correct it.)
  • this comment as mentioned seems like a bug that would omit a channel. I haven't run the code to check that, but nor is there any comment to say it doesn't. If it does, this is a bug I feel is worth fixing before merge.
  • I know we're about to merge Yue's code that has several memory leaks but we shouldn't make a habit of it. Please address this one.

@marc1uk
Copy link
Collaborator

marc1uk commented Mar 4, 2026

regarding this discussion on the output file from NeutronCheck, I'm inclined to agree with Franklin. Generating debug files is no bad thing - it allows focused validation and debugging, without polluting general-purpose analysis outputs. Unless these variables are likely to be useful for analysis, I would keep them where they are.
In general, I also agree that the PhaseIITreeMaker is not a good general-purpose analysis writer. It was originally a hack Tool to allow quick generation of ROOT plots. It now makes a tree with potentially 334 branches and counting. That's absurd! It has to be updated in-sync with every other Tool that adds or modifies what goes into it. The Tool, and its output files, will only grow as more Tools add to it. I don't personally consider it maintainable, or an appropriate way to work.
At minimum we could put a Tree in the DataModel and allow Tools to add branches to it, and have a simple Writer tool at the end of the Toolchain that calls TTree::Fill. That way at least each Tool remains responsible for its content, and the output file reflects what Tools were in the ToolChain, keeping content relevant to the analysis as per Franklin's request.
But of course the ANNIEEvent BoostStore is intended to be ANNIE's official output, and I'm really not sure why there's such an avoidance of BoostStores in favour of the PhaseIIColossalTreeMaker output instead....

@marc1uk
Copy link
Collaborator

marc1uk commented Mar 4, 2026

I don't mean to add further issues, and this is probably not the right place for it, but I can't help but notice that:

  • ClusterSearcherConfig
  • DigitBuilderConfig
  • MCRecoEventLoaderConfig

all have a config variable isMC. While setting 3 variables isn't an especially big burden, it would seem more sensible that the appropriate data reader Tool sets a flag in the DataModel and Tools check that, rather than requiring multiple flags be appropriately set for the given input files.

@jminock jminock removed their assignment Mar 5, 2026
…ctorConfig

-removed unhelpful forced loop end from single-pe gains extraction in ClusterSearcher::Initialize()
-changed Genie version in ClusterSearcher/LoadGenieEventConfig to 1
@flemmons
Copy link
Contributor Author

flemmons commented Mar 6, 2026

Regarding these [https://github.com//pull/365#issuecomment-4000879256] :
-I've changed the genie version variable to 1, which seems to be used in other toolchains. Although I will mention that my ToolAnalysis cannot find the files for these, so it can't run with this setting. I haven't checked the trueQ2 output. Seeing as how the only thing trueQ2 is used for is direct output to the file in NeutronCheck, I do not feel that this should hold up the PR any further.
-The charge balance constant is a known issue, but it's not as quick a fix as I'd like, since I'm hoping to integrate PMT masking, and make it reactive to the actual number of currently active PMTs. I will need to pass that information in from other tools. I will do this in a future update, but seeing as it's not ready to go, I'll request we let the status quo ride this PR into the next one.
-I have removed that line in the new commit.
-I'm not sure you used the correct link. What memory leak do you mean?

flemmons and others added 2 commits March 6, 2026 10:21
-replaced all instances of fClusterList with fRecoClusters in ClusterSearcher
-changed ClusterSearcher::RecoClusters() to a void function, since it was returning a member pointer to itself
-removed ClusterSearcher::SelectByClusters(), which is not used
@marc1uk
Copy link
Collaborator

marc1uk commented Mar 10, 2026

Thanks for the updates Franklin. That's got the vast majority of comments resolved, i think we're very nearly there.

The only one I'm still somewhat reticent to ignore is the '123' constant in the charge balance calculations.
Part of my concern is that I believed 123 to be a meaningless placeholder value that would give invalid outputs, but perhaps it's not that bad. If this is a hard-coded but correct number (based on a quick check of the HV page, ANNIE has ~119 tank PMTs active, so perhaps 123 before bad-channel masking is correct?), then that's a bit better.

While applying run-wise bad channel masking would be ideal, I'd still much prefer leaving that as a TODO but still fetching the value from geometry over hard-coding it. It feels like the distinction between "this is a simplification" and "this is sloppy", and it should be trivial to implement.

Aside from that it seems like there's a compilation error flagged by the latest CI build:

In member function 'void ClusterSearcher::RecoClusters(std::vector<RecoDigit*>*)':
UserTools/ClusterSearcher/ClusterSearcher.cpp:636:42:
error: no matching function for call to 'RecoCluster::SetDigits(std::vector<RecoDigit>&)' cluster.SetDigits(ClusteredDigits);
                                          ^
In file included from UserTools/ClusterSearcher/ClusterSearcher.h:9,
                 from UserTools/ClusterSearcher/ClusterSearcher.cpp:1:
include/RecoCluster.h:28:15: note: candidate: 'void RecoCluster::SetDigits(std::vector<RecoDigit>*)'
   inline void SetDigits(vector<RecoDigit>* indigits){fDigitList=indigits;}
               ^~~~~~~~~
include/RecoCluster.h:28:15: note:   no known conversion for argument 1 from 'std::vector<RecoDigit>' to 'std::vector<RecoDigit>*'
make[1]: [Makefile:203: UserTools/ClusterSearcher/ClusterSearcher.o] Error 1 (ignored)

@marc1uk
Copy link
Collaborator

marc1uk commented Mar 11, 2026

Reply on slack:

According to the original phase-II design specifications, 123 is one short of the ideal case with all PMT slots full.  I haven't dug deeply into why yet, but I assume it was instituted after one tube had failed/been removed/not been installed.  There's an empty sconce in the tank, the story of which I don't know.  The 123 was put in by Michael before I got involved, and since it doesn't make much intra-run difference to the measurement (it's largely just a correction so that if all the PMTs see exactly the same charge, the CB becomes 0 for the extreme case), it's been a low priority.  That said, I do plan to fix it when I shuffle in my other neutron identification updates in the next couple of weeks.

Ok, if it's got motivation and precedent, that's good enough for me. 👍

@marc1uk marc1uk merged commit b0e491f into ANNIEsoft:Application Mar 11, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants